
MIT-QCRI Arabic Dialect Identification System for the 2017 Multi-Genre Broadcast Challenge



Abstract

In order to successfully annotate the Arabic speech content found in open-domain media broadcasts, it is essential to be able to process a diverse set of Arabic dialects. For the 2017 Multi-Genre Broadcast challenge (MGB-3) there were two possible tasks: Arabic speech recognition, and Arabic Dialect Identification (ADI). In this paper, we describe our efforts to create an ADI system for the MGB-3 challenge, with the goal of distinguishing amongst four major Arabic dialects, as well as Modern Standard Arabic. Our research focused on dialect variability and domain mismatches between the training and test domain. In order to achieve a robust ADI system, we explored both Siamese neural network models to learn similarity and dissimilarities among Arabic dialects, as well as i-vector post-processing to adapt domain mismatches. Both acoustic and linguistic features were used for the final MGB-3 submissions, with the best primary system achieving 75% accuracy on the official 10hr test set.
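As a rough illustration of the Siamese idea mentioned in the abstract, the sketch below passes two utterance-level vectors through one shared branch and scores the pair by cosine similarity. It is not the authors' architecture: the single tanh layer, the vector sizes, the margin value, and the names `embed`, `siamese_similarity`, and `contrastive_loss` are all assumptions chosen for illustration, and random vectors stand in for real i-vectors.

```python
import numpy as np

def embed(ivector, weights):
    """Shared branch: the same weights are applied to either input."""
    hidden = np.tanh(weights @ ivector)
    return hidden / np.linalg.norm(hidden)  # unit length, so a dot product is a cosine

def siamese_similarity(ivec_a, ivec_b, weights):
    """Cosine similarity between the two branch outputs, in [-1, 1]."""
    return float(embed(ivec_a, weights) @ embed(ivec_b, weights))

def contrastive_loss(similarity, same_dialect, margin=0.5):
    """Illustrative pair loss (assumed form, not taken from the paper).

    Same-dialect pairs are pulled toward similarity 1; different-dialect
    pairs are penalised only when their similarity exceeds the margin.
    """
    if same_dialect:
        return (1.0 - similarity) ** 2
    return max(0.0, similarity - margin) ** 2

# Hypothetical sizes: 16-dimensional inputs projected to 8 dimensions.
rng = np.random.default_rng(0)
weights = rng.standard_normal((8, 16))
utt_a = rng.standard_normal(16)  # stand-in for an i-vector
utt_b = rng.standard_normal(16)  # stand-in for another i-vector

sim_ab = siamese_similarity(utt_a, utt_b, weights)
loss_if_same = contrastive_loss(sim_ab, same_dialect=True)
loss_if_different = contrastive_loss(sim_ab, same_dialect=False)
```

In a trained system the shared weights would be learned by minimising such a loss over many labelled pairs; here they are random, so the scores carry no meaning beyond showing the data flow.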
